Abstract. This paper introduces a special section of
the STTT journal containing a selection of papers that
were presented at the 7th International SPIN workshop,
Stanford, August 30 - September 1, 2000. The workshop
was named SPIN Model Checking and Software
Verification, with an emphasis on model checking of
programs. The paper outlines the motivation for
stressing software verification, rather than only
design and model verification, by presenting the work
done in the Automated Software Engineering group at
NASA Ames Research Center within the last 5 years.
This includes work in software model checking,
testing-like technologies and static analysis.


1 Introduction 

This special section contains a selection of five papers 
that were amongst the 17 papers and six invited talks 
and tutorials presented at the 7th International SPIN 
workshop, arranged at Stanford University, California, 
USA, August 30 - September 1, 2000. The original pro- 
ceedings were published in Lecture Notes in Computer 
Science volume 1885, Springer, titled: SPIN Model Check- 
ing and Software Verification. Model checking is a tech- 
nique for exploring all possible execution sequences of a 
system of interacting concurrent components. Such
systems may behave in unexpected ways due to the
unpredictable speeds of the various components, and are
hence extremely difficult to test using traditional
testing techniques. The many ways components can
interact usually lead to a large search space, and
model checkers typically incorporate various techniques
for conquering this complexity. The SPIN model checker
[36], for which Gerard Holzmann recently received the
ACM Software System Award, has a large user community,
and the SPIN workshop is a forum for this community,
and generally for researchers with interest in
automata-based,
explicit state model checking technologies for the analy- 
sis and verification of asynchronous concurrent and dis- 
tributed systems. The first SPIN workshop was held in 
October 1995 in Montreal. Subsequent workshops were 
held in New Brunswick (August 1996), Enschede (April 
1997), Paris (November 1998), Trento (July 1999), and 
Toulouse (September 1999). 

Traditionally, the SPIN workshops present papers on 
extensions and uses of SPIN. As an experiment, SPIN
2000 was broadened to have a slightly wider focus than
previous workshops in that papers on software
verification were encouraged, as reflected in the name of the
workshop: SPIN Model Checking and Software Verifica- 
tion. In this paper we shall try to explain the background 
for emphasizing software verification. We will do this by 
outlining in the following sections some of the research 
that has taken place in our own verification research 
group at NASA Ames Research Center since the group
was started in 1997, together with some
thoughts on the future. The verification group is part of 
the Automated Software Engineering (ASE) group, the 
purpose of which is to develop software technology for 
supporting software development within NASA. The se- 
lected papers will be introduced and related to this work 
in special subsections throughout the presentation. 

By software verification we mean model checking of
source code (or the corresponding object code it is
compiled into). This is in contrast to analysis of
designs or models of software, which are usually much
more abstract. That is, we suggest focusing attention
on the real beast in all its complexity. Although this
view seems, at the time of writing, to have caught on
as a popular research topic, in the time leading up to
the workshop the subject was investigated by only a few
research groups, including our own. Amongst the other
work in this domain at the time was [4] and [11], and in
fact only [4] was known to us when we started. Now SPIN
has a C interface [37] and can hence model check C
programs, and other tools exist, as will be elaborated
in later sections.

Although targeting source code may appear to just
worsen the problem of state space explosion usually
associated with model checking, we believe that there
are some benefits to such an approach, as we shall
outline here. Note that we do not suggest that design
or model verification is uninteresting, far from it.
However, our experience from several experiments during
1996 and 1997 at NASA, as well as with a Danish
audio/video company, led to the (folklore) conclusion
that programmers often write code without first writing
a detailed design. We concluded that if formal methods
were to be adopted at NASA within a shorter time frame,
we would have to provide a technology that could
analyze real programs.

One can argue that programmers should be urged to 
write formalized designs that can be analyzed. However,
one may counter that in order for a design to contain
enough information to be useful for formal analysis,
the design may approach the final system in complexity,
in which case programmers will avoid the extra work and
just write the code directly. This may be the reason
why software developers do not create detailed designs 
as do engineers in other disciplines. The distance for ex- 
ample between a design of a bridge, and the bridge itself, 
is enormous, and therefore the design is well motivated. 
In case the code is generated from the design, the design 
becomes the code, and we are left with code analysis any- 
way. Even a mainly graphical design language such as 
UML raises the issue of program verification since UML 
designs can contain code fragments, and can evolve into 
fully fledged programs. 

This new trend brings new challenges into focus, such
as dealing with object-oriented dynamic memory
allocation and garbage collection, program libraries,
and, last but not least, an increased state space to
explore. These problems require new approaches, perhaps
the most challenging being how to deal with really big
state spaces. Techniques to deal with this include, for
example, static analysis, abstraction, guided search,
and intelligent testing techniques that lie in between
complete state space exploration, such as model
checking, on the one hand, and partial search, such as
simulation, on the other. We believe that this is an
interesting research direction for the formal methods
community for the following reasons. First of all, if
tools can handle real programs, the user community will
increase dramatically. Second, programming languages
often offer quite convenient notations for expressing
solutions to problems, compared to modelling languages.
Third, by trying to handle real programs, the
scalability issue becomes much more pressing, and will
therefore spawn new research to develop more scalable
solutions that can even help in design verification.
Fourth, and finally, researchers from different groups
working on model checkers for the same programming
language will be able to exchange examples and compare
technologies very easily.

The remainder of the paper is organized as follows. In
Section 2 we describe a case study where SPIN was
applied to the analysis of a spacecraft controller,
successfully identifying several errors. This and other
previous case studies led to the development of the
Java PathFinder 1 system: a translator from Java to the
PROMELA language of SPIN, described in Section 3. This
system makes it possible to model check programs
written in a non-trivial subset of Java. Section 4
describes another case study where SPIN was applied to
analyze an avionics real-time operating system. Java
PathFinder 1 was limited in the sense that it could not
handle the Java libraries well. Translating the
libraries would give too large PROMELA models, and
writing stubs for them would require an enormous amount
of work. Hence it was decided to model check Java byte
code instead, based on a homegrown Java Virtual
Machine. This effort is described in Section 5. Section
6 identifies some technologies that are regarded as
essential to make model checking of software work. This
includes topics such as abstraction and search
heuristics. One of our more recent research topics is
runtime verification, as described in Section 7, where
scalability is achieved by just examining single
execution traces. Lastly, some final thoughts are given
in Section 8.

2 The Remote Agent Example 

2.1 Description of the Remote Agent 

The first verification case study performed in the
newly created Automated Software Engineering group at
NASA Ames was the application of the SPIN model checker
to analyze part of the Remote Agent spacecraft
controller [29,28]. The Remote Agent is a software
system based on artificial intelligence techniques such
as planning and scheduling. It is meant to execute on
board the spacecraft, and its purpose is to take over
part of the operations normally carried out on the
ground during the operation of a spacecraft, thereby
relieving ground personnel from micro-managing the
spacecraft and letting them focus on higher level goal
management instead. The Remote Agent was tested on
board the Deep Space 1 spacecraft during May 1999. The
spacecraft itself was launched on October 24, 1998. It
was the first demonstration of a complete take-over of
a spacecraft by an artificial intelligence based
software system in NASA's history.

The Remote Agent consists essentially of three
modules: a Planner, an Executive, and a Diagnosis
module. The standard operation of the spacecraft using
this system may proceed as follows: a goal is created by
ground personnel, for example “move towards the comet
and take a picture”, and up-linked to the spacecraft.
The Planner on board will then generate a plan from
this goal using a set of sophisticated search
algorithms, based on a static predefined model of what
the possible transitions are relative to a current
state. The result is a plan specifying a sequence of
tasks for each relevant component on board the
spacecraft that must be performed in succession in
order to achieve the goal. Tasks from different
components may run in parallel according to certain
time constraints generated as part of the plan. The
plan is then sent to the Executive, which executes the
plan, thereby operating the spacecraft. The Diagnosis
module constantly monitors the behavior of the craft
and compares the observed behavior to the expected,
signaling the Executive, or in the worst case the
Planner, if something goes wrong, whereafter proper
action can be taken to repair the situation.

The Executive was selected for the verification case
study, and in particular the language named ESL
(Executive Support Language), implemented as an
extension to multi-threaded Common Lisp for supporting
the execution of tasks. ESL is essentially an API for
multi-task programming similar to POSIX threads, but
with extra domain specific functionality.


2.2 Model Checking

The ESL module consisted of approximately 3000 lines
of code. Initially we had a choice between various
possible verification tools, mainly theorem provers and
model checkers. We quickly decided that theorem proving
would be too time consuming for an experiment limited
to a couple of months, and our goal was to find errors,
not to prove complete correctness. We decided to use
SPIN since it already had a programming-language-like
syntax and since it allowed dynamic process creation,
one of the features of the system.

From the 3000 lines of LISP code we extracted
approximately 500 lines of PROMELA code, representing
an abstraction of the original system. The abstraction
was made based on informal reasoning, focusing
attention on a lock table that all threads were
accessing. By asking engineers, two properties were
formulated in SPIN's Linear Temporal Logic (LTL) and
verified against the model. Neither of the two
properties turned out to be satisfied by the model, and
a total of 4 classical concurrency errors were
revealed, each of which had a counterpart in the
original code, as confirmed by the programmer. They
were classical concurrency errors in the sense that
they could occur due to totally unexpected
interleavings of tasks, interleavings that had not been
detected by traditional testing. As an example, one of
these violations was caused by a missing critical
section around a piece of code of the form:

if (no_new_events())
    goto_sleep()

This would cause the executing thread to evaluate the
condition no_new_events(), and in case it evaluated to
true, decide to go to sleep. However, if a new event
occurred in between the evaluation of the condition and
the actual call of goto_sleep, the thread would miss
the new event and just go to sleep.

The programmer of the system was very impressed by
these results, as documented in [29]. As an interesting
aftermath, when the Remote Agent was activated on May
18, 1999, an anomaly occurred: thrusting did not turn
off as requested. The experiment was immediately
terminated from the ground and put in stand-by mode for
5 hours until the reason for the error had been
detected. It turned out to be a missing critical
section around a piece of code similar to the one
above, but in a different part of the system that had
not been analyzed with SPIN. One thread would then
block, missing an event, and the whole system would
eventually deadlock. We had hence demonstrated to NASA
that model checking can successfully find errors that
can damage a mission.

2.3 Lessons Learned

The experiment was regarded as successful by all
involved parties. Not only had 4 errors been found that
were very hard to find with normal testing, but one of
them also revealed a major design flaw in the system.
Furthermore, one of the errors was later reintroduced
in a sibling module, causing a deadlock during flight
that put the spacecraft in stand-by mode for several
hours.
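
The missing critical section around the test for new events and the
call to go to sleep can be illustrated with a small Java sketch. The
class LostWakeupSketch and all of its method names are hypothetical
and introduced only for illustration; the original code was written
in LISP and is not reproduced here. To keep the example
deterministic, the context switch between the test and the sleep is
modelled by a callback rather than by a real second thread.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of the check-then-sleep race; not the Remote Agent code. */
class LostWakeupSketch {
    private final Queue<String> events = new ArrayDeque<>();
    private boolean sleeping = false;

    synchronized boolean noNewEvents() { return events.isEmpty(); }

    synchronized void gotoSleep() { sleeping = true; }

    /** Called by another thread: queue an event and wake the consumer if it sleeps. */
    synchronized void postEvent(String event) {
        events.add(event);
        sleeping = false;
    }

    /** True if the thread sleeps although an event is pending (the event was missed). */
    synchronized boolean missedEvent() { return sleeping && !events.isEmpty(); }

    /** Unprotected version: the other thread may run between the test and the sleep. */
    void racyWait(Runnable otherThread) {
        if (noNewEvents()) {
            otherThread.run();   // models a context switch at the worst possible moment
            gotoSleep();
        }
    }

    /** Protected version: test and sleep form one critical section. */
    void safeWait(Runnable otherThread) {
        synchronized (this) {
            if (noNewEvents()) {
                gotoSleep();
            }
        }
        otherThread.run();       // the other thread can only run outside the critical section
    }

    public static void main(String[] args) {
        LostWakeupSketch racy = new LostWakeupSketch();
        racy.racyWait(() -> racy.postEvent("new-event"));
        System.out.println("racy version missed event: " + racy.missedEvent());

        LostWakeupSketch safe = new LostWakeupSketch();
        safe.safeWait(() -> safe.postEvent("new-event"));
        System.out.println("safe version missed event: " + safe.missedEvent());
    }
}
```

In the unprotected version the event arrives after the test but
before the sleep, so the thread goes to sleep with an event pending.
A model checker finds this interleaving by exhaustive exploration,
whereas a test run has to hit it by chance.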

However, observing the verification process, the
result was not so encouraging. Twelve man-weeks (two
researchers for 6 weeks) were spent on creating the 500
line PROMELA model from the 3000 lines of LISP code.
The LISP code was undocumented and used many layers of
macros, which made it difficult to read. Just
understanding the code in order to make a proper
translation was definitely one of the problems. A
second problem was to define the mapping from the very
powerful LISP language to the less powerful PROMELA
language. A third problem was to decide what parts to
translate and, for those parts, whether the translation
should be one-to-one or some abstraction. It was clear
to us that the first two problems (understanding and
translation) were the hardest, while the abstraction
problem, strangely enough, was less of a problem. This
gave us the hope that if the translation could be
automated, and the verification was performed by the
programmer himself, using some kind of semi-automated
abstraction support tool, then the experiment could
potentially have been done within a single day.

Another important source of experience supporting
the construction of a software model checker was the
application of the UPPAAL real-time model checker [39]
to analyze two audio/video systems developed by the
Danish audio/video company Bang & Olufsen [26,25].
The verification effort was very successful on one
occasion ([26]) in that a 10 year old known, but
unexplained, bug was found and explained. However, as
was the case with the Remote Agent study, some time was
spent on manually creating a model from the program, in
one case 2500 lines of assembler code. As a result of
these experiences we decided to create a translator
from a programming language to PROMELA, as described in
the next section. The first idea of developing a Java
model checker was in fact conceived during the work
with UPPAAL in 1997.

3 Java PathFinder 1 
3.1 Rationale 

As outlined in the previous section, a series of
experiments with applying existing model checkers to
the modelling and analysis of software systems had led
to the observation that it would be extremely useful if
model checkers could analyze programs written in
traditional programming languages. We therefore decided
to develop a model checker for some well-chosen
programming language, and the choice fell on Java.

There are several objective reasons for choosing Java. 
First, it was viewed as important that the chosen lan- 
guage was object oriented since that was the current 
trend in programming language design. Second, the lan- 
guage should also be popular in order to gain a broader 
user community. These criteria ruled out C (not object 
oriented) and LISP (not popular). C++ was regarded as 
too complicated for formal analysis due to its rich syn- 
tax and capabilities for operating with pointers. Java was 
hence the obvious choice for the above reasons. However, 
NASA currently operates mostly in C, and in some cases 
in C++. LISP was only used for the Remote Agent ex- 
periment and has been abandoned for future missions. 
This left us with the burden of justifying the choice
of Java. Our response was that Java would be good for
prototyping the ideas, and that Java could potentially
become the language of the future. As it turns out,
experiments are currently being undertaken within NASA
to evaluate Java as a possible replacement for C and
C++. The emergence of Real-Time Java may have an
important role to play in this decision.

The development of a model checker for Java could
again take a number of avenues. One could either write
a model checker from scratch for Java, or write a
translator from Java to the modelling language of some
existing model checker. The SPIN model checker was
early on regarded as either the target for a
translation, or at least an example of how one could
write a new model checker for a programming language.
The PROMELA language has a high resemblance to a
programming language. One of the salient features of
PROMELA is the capability of dynamic process creation.
We imagined early on that this could be used to model
dynamic thread creation as it exists in Java.

It was finally decided to write a translator from Java
to PROMELA, the modelling language for SPIN, since it
would potentially require less work than writing a
model checker from scratch. The project was named Java
PathFinder (JPF) [30], later to be named Java
PathFinder 1 (JPF1), after the Mars PathFinder rover
that explored Mars in 1997. The goal was to produce a
prototype relatively fast in order to evaluate the
potential of model checking real programs. At the time,
only a source-to-source translation was considered.
Java source code is compiled into byte code by the
compiler, and hence an alternative approach would have
been to translate byte code to PROMELA. This latter
approach was, however, hardly considered at the time,
possibly reflecting a belief that byte code
verification would be too inefficient, with too many
detailed interleavings between single byte code
instructions. As described in Section 5, when we later
drew lessons from the JPF1 project, byte code
verification actually turned out to be a very viable
solution.

3.2 Design and Implementation 

JPF1 translates a Java program into a PROMELA model.
The Java program can contain assertions as calls to an
assert method, which will be translated into PROMELA's
assert statement. The resulting PROMELA model can then
be checked for assertion violations and deadlocks. It
is of course also possible to check general LTL
formulae on the resulting PROMELA model, although this
requires some minimal knowledge about the generated
PROMELA code. Error traces produced by SPIN are
visualized using SPIN's message sequence charts,
assuming that special print statements have been
inserted into the code. JPF1 does not apply any
analysis to reduce the state space of the generated
model. Hence, the Java program must have a finite and
tractable state space.

The translator is developed in LISP, and comprises
6000 lines of code. We have used an already existing
parser front-end written in Moscow ML by Peter Sestoft
(the Royal Veterinary and Agricultural University in
Denmark), ported from a Standard ML version written by
Olivier Brunet and Gordon Woodhull (University of
California, Berkeley, USA). The parser handles Java
1.0, an early version of Java. As a result, the
translator translates a subset of Java 1.0. However, a
significant subset of Java 1.0 is supported by JPF1.
This includes: class definitions with class variables,
fields and methods; simple data types such as integers,
booleans, object references and arrays of all these
types; class inheritance; dynamic object creation;
threads and synchronization primitives such as
synchronized statements and the wait and notify
methods; exceptions and thread interrupts; and finally
most of the standard programming language constructs
such as assignment statements, conditional statements
and loops. Amongst the features not translated are:
packages (the parser could only read one package),
overloading, method overriding, recursion (since method
calls are translated by inlining), strings, floating
point numbers, some thread operations like suspend and
resume, and some control constructs, such as the
continue statement. Furthermore, arrays are not objects
as in Java since they are modelled using PROMELA's own
arrays to obtain an efficient verification. Finally,
but perhaps most importantly, the translator cannot
translate the pre-defined class library, including for
example numerous container classes. In spite of these
omissions, JPF1 at the time translated more of Java
than any other similar tool known to us.

A key design issue was how to translate dynamic
object creation. This is handled by defining, for each
class, an array of fixed size, each entry of which
corresponds to the data area of an object of the class.
Hence, for example if a class has two variables x and y, 
then an array of records containing these two variables 
is generated. The size of the array sets a limit on how 
many objects of the class can be generated, and must 
be re-defined by the user if the default value is not sat- 
isfactory. An index variable always points to the next 
free object. An object reference is a pair consisting of 
the name of the class and an index variable pointing 
into the corresponding array. Another key issue was the 
translation of dynamic thread creation and the various 
thread synchronization constructs. Threads are natu- 
rally mapped to PROMELA processes. The key synchro- 
nization constructs, such as the synchronized methods, 
the synchronized statement, and wait and notify, are 
handled by introducing extra variables in the data area 
for each object (in the array corresponding to the class). 
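
The fixed-size table used for dynamic object creation can be
sketched in Java as follows. This is a minimal illustration under
stated assumptions, not the PROMELA code that JPF1 generates: the
names ObjectTableSketch, Slot, Ref, create and deref are
hypothetical, and the choice to raise an error when the table is
full is an assumption, since the text does not say how the bound is
enforced.

```java
/** Hypothetical sketch of a per-class object table; names are illustrative only. */
class ObjectTableSketch {
    /** Data area of one object of a class with two variables x and y. */
    static final class Slot {
        int x;
        int y;
    }

    /** An object reference: the class name plus an index into that class's table. */
    static final class Ref {
        final String className;
        final int index;

        Ref(String className, int index) {
            this.className = className;
            this.index = index;
        }
    }

    private final String className;
    private final Slot[] table;     // fixed size, chosen before verification starts
    private int nextFree = 0;       // always points to the next free entry

    ObjectTableSketch(String className, int maxObjects) {
        this.className = className;
        this.table = new Slot[maxObjects];
        for (int i = 0; i < maxObjects; i++) {
            table[i] = new Slot();
        }
    }

    /** Models "new": hands out the next free entry (assumed to fail when the table is full). */
    Ref create(int x, int y) {
        if (nextFree >= table.length) {
            throw new IllegalStateException("object table for " + className + " is full");
        }
        table[nextFree].x = x;
        table[nextFree].y = y;
        return new Ref(className, nextFree++);
    }

    /** Follows a reference to the data area it denotes. */
    Slot deref(Ref ref) {
        return table[ref.index];
    }

    public static void main(String[] args) {
        ObjectTableSketch points = new ObjectTableSketch("Point", 2);
        points.create(1, 2);
        Ref q = points.create(3, 4);
        System.out.println(q.className + "#" + q.index + " has y = " + points.deref(q).y);
    }
}
```

Because the table never shrinks, the bound limits the total number
of objects ever created, which is one reason the user may need to
enlarge the default size.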

For example, locking an object is modelled by
introducing a LOCK variable, which by default contains
null, and which is assigned the thread id of any thread
locking the object. Another thread cannot access the
object in case this variable differs from null.
Similarly, a PROMELA zero-capacity (synchronous)
channel variable is introduced to model the wait and
notify operations: waiting corresponds to executing a
"?" operation on the channel and a notification
corresponds to executing a "!" operation. A major
feature of the translator is that it can handle
exceptions and the finally construct. Exceptions are
translated by using the unless construct of PROMELA
(see footnote 1). A special variable EXN is introduced
in each thread object, holding the default null value.
An exception (which is an object in Java) is thrown by
storing the exception object into this variable, which
again triggers the surrounding unless-constructs, which
are of the form P unless EXN != null.
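
The LOCK and EXN encodings can be mimicked in a few lines of Java.
The sketch below is only an approximation under stated assumptions:
apart from the variable names LOCK and EXN, all names are
hypothetical; the reentrancy of Java monitors is ignored; and the
unless construct is approximated by checking the guard before each
step of P.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the LOCK and EXN encodings; only these two names come from the text. */
class EncodingSketch {
    static final int NULL = -1;   // stands in for the model's null value

    /** Extra variable added to the data area of every object. */
    static final class ObjectData {
        int LOCK = NULL;          // id of the locking thread, or NULL if unlocked

        /** A lock step is enabled only while no thread holds the object. */
        boolean tryLock(int threadId) {
            if (LOCK != NULL) {
                return false;     // step not enabled: the caller has to wait
            }
            LOCK = threadId;
            return true;
        }

        /** Only the thread that holds the lock can release it. */
        void unlock(int threadId) {
            if (LOCK == threadId) {
                LOCK = NULL;
            }
        }
    }

    /** Extra variable added to every thread object. */
    static final class ThreadData {
        Object EXN = null;        // pending exception object, or null

        /** Mimics "P unless EXN != null" and returns the number of executed steps of P. */
        int runUnlessExn(List<Runnable> p) {
            int executed = 0;
            for (Runnable step : p) {
                if (EXN != null) {
                    break;        // the unless-guard became true, so P is abandoned
                }
                step.run();
                executed++;
            }
            return executed;
        }
    }

    public static void main(String[] args) {
        ObjectData obj = new ObjectData();
        System.out.println("thread 1 locks: " + obj.tryLock(1));
        System.out.println("thread 2 locks: " + obj.tryLock(2));

        ThreadData thread = new ThreadData();
        List<Runnable> body = Arrays.<Runnable>asList(
            () -> { },
            () -> thread.EXN = new RuntimeException("thrown"),
            () -> System.out.println("never reached"));
        System.out.println("steps executed: " + thread.runUnlessExn(body));
    }
}
```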

1 Gerard Holzmann introduced a special -J (for Java) option in
SPIN to interpret unless from inside-out rather than from
outside-in, in order to make it useful for this translation.


3.3 Lessons Learned

JPF1 was considered a successful tool, attracting
attention from various research groups. The tool was
applied to a game server consisting of 1400 lines of
Java code in 16 classes [34]. Although the example was
not very big, it was non-trivial, and not written with
formal verification in mind. A suspicion about a
deadlock in the system was confirmed using JPF1. The
tool was also applied to analyze the Remote Agent after
it deadlocked in space, as described in [28]. In this
case the spacecraft engineers at JPL in Los Angeles
informed us that a deadlock had occurred and challenged
us to find the error using model checking. We did find
the error, although we discovered it through code
review, since we had seen it before, as described in
Section 2. JPF1 was, however, used to confirm that it
was an error.

It was clearly felt that smaller Java programs of up
to 2000 lines of code could be handled with this kind
of technology (see footnote 2). This could either mean
that the technology was well suited for unit testing,
or perhaps for testing even larger systems using
abstraction before the application of the tool.
However, the tool itself had some drawbacks concerning
its applicability. As described earlier, although a
considerable subset of Java was translated, not all was
translated, and in particular not the pre-defined Java
library. It was regarded as impractical to translate
the library using JPF1 (even if we had the sources).
Hence, a program would have to be modified in order to
fit the well-formedness criteria of the translator if
it used the library, and most Java programs do. Also,
there were other translation omissions, such as
recursion, that seemed hard to capture within the then
existing translation framework. In general, it was the
perception that the closer one got to covering 100% of
Java, the harder it became to extend the translator. As
it turned out, working at the byte code level would
solve all these problems without a big loss of
efficiency.

2 Evidently, the complexity of a program cannot be purely
measured in terms of lines of code, but rather one has to consider
the amount of interleaving possible between threads.

4 The DEOS Case Study

In 1998 Honeywell Technology Center approached the
ASE group with a request to investigate techniques that
would be able to uncover errors that testing is not
well suited to catch [43]. The next generation of
avionics platforms will shift from federated system
architectures to Integrated Modular Avionics (IMA),
where all the software runs on a single computer, with
an operating system ensuring time and space
partitioning between the different processes. For
certification of critical flight software the FAA
requires that software testing achieves 100% coverage
with a structural coverage measure called Modified
Condition/Decision Coverage (MC/DC). Honeywell was
concerned that 100% structural coverage would not be
able to ensure that behavioral properties like time
partitioning will be satisfied. In particular, they
developed a real-time operating system, called DEOS,
where an error in the time partitioning of the O/S was
not uncovered by testing. As an experiment the ASE
group undertook the challenge of finding this error
with a model checker without knowing what it was, where
it was, or even how to check for it. In a kick-off
meeting Honeywell visited the ASE group and discussed
the basic functionality of DEOS, and subsequently
produced a slice of the O/S that contained all the code
required to show the error. The code that was analyzed
was 1500 lines of C++ code (full DEOS is 10000 lines of
code).

Since we did not have a model checker that could take
C++ as input, we were forced to again translate the
code to a suitable model checker's input notation.
However, unlike with the Remote Agent, we decided to do
a methodical 1-to-1 mapping between the code and the
model checker input, so that we could avoid first
understanding all of the program. We again chose the
SPIN model checker since the PROMELA language was the
closest model checker input to a real programming
language like C++. The translation scheme we used was
based on the Java PathFinder 1 approach for dealing
with object-oriented programs (see Section 3).

The error was found in 3 man-months of work, divided
between 1 man-month translating the C++ code to PROMELA
and 2 man-months finding the error. In the Remote Agent
case it took 3 man-months to translate the code, and
two man-weeks to do the analysis. This difference can
easily be explained by the differences between the two
systems: one (DEOS) was nearing the end of its design
cycle, was written to be certified, had been tested
thoroughly and contained only one very subtle error;
the other (Remote Agent) was in the middle of its
development cycle, was written in a semi-research
environment, had been tested by the developers and
contained a number of errors.

The analysis of the DEOS system was very well received
by Honeywell, and subsequently the DEOS system became
the focus of a number of research efforts [43,50,17].
Honeywell proceeded to create their own model checking
team to analyze future DEOS enhancements as well as the
applications to run on top of DEOS [7]. Honeywell is
continuing to extend the DEOS PROMELA model to support
verification of more complex versions of DEOS.


4.1 Lessons Learned 


From a research perspective the work on DEOS validated
our hypothesis that real programs can be analyzed
directly. However, DEOS also showed us some other
problems:

- Model checking programs directly shifts the burden
  of work from the translation of the code into the
  model checker's input to the analysis of the code.

  - Typically the translation from code to model
    checker input involves some ad-hoc abstraction and
    slicing of the code, which makes the model checking
    more efficient.

  - When this translation is done 1-to-1, much of the
    clever encoding that was previously done by the
    human translator now needs to be done by clever
    tools with minimal human input.

- Creating an environment for the program to execute
  in during model checking can be very challenging.

  - Model checkers can only analyze closed systems,
    hence any system to be analyzed must be supplied
    with an environment to drive it. This is analogous
    to creating a test driver and selecting test cases
    to support testing.

  - Creating an environment for DEOS to show the error
    occurring took the most time in the DEOS model
    checking (2 man-months).
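
What closing a system with an environment amounts to can be
sketched in Java on a toy example. The sketch is an assumption-laden
illustration and has nothing to do with the actual DEOS code or
model: the budget counter, the two events and all names are
hypothetical. The environment simply offers every possible event in
every state, up to a bounded depth, which is the exhaustive
counterpart of writing a test driver and selecting test cases.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of closing a system with an environment; this is not the DEOS model. */
class EnvironmentSketch {
    /** Everything the environment may do to the system in one step. */
    enum Event { USE, PERIOD_END }

    static final int BUDGET = 2;

    /** Toy system under analysis: counts budget units used in the current period. */
    static int step(int used, Event event) {
        return event == Event.USE ? used + 1 : 0;   // deliberately lacks a budget check
    }

    static boolean propertyHolds(int used) {
        return used <= BUDGET;
    }

    /**
     * The environment tries every event in every reachable state, up to the given depth.
     * Returns a trace leading to a property violation, or null if none exists within the bound.
     */
    static List<Event> findViolation(int used, int depth, List<Event> trace) {
        if (!propertyHolds(used)) {
            return new ArrayList<>(trace);
        }
        if (depth == 0) {
            return null;
        }
        for (Event event : Event.values()) {
            trace.add(event);
            List<Event> bad = findViolation(step(used, event), depth - 1, trace);
            trace.remove(trace.size() - 1);   // backtrack before trying the next event
            if (bad != null) {
                return bad;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("depth 2: " + findViolation(0, 2, new ArrayList<>()));
        System.out.println("depth 3: " + findViolation(0, 3, new ArrayList<>()));
    }
}
```

With a depth bound of 2 no violation is reachable, while a bound of
3 exposes the trace of three consecutive USE events. Choosing the
event alphabet and the bound is the kind of environment design work
that the lessons above refer to.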

4.2 Related Paper in this Special Section 

Traditionally the SPIN workshop has had a strong focus
on the use of SPIN in real-world case studies, similar
to the DEOS case study described here. In keeping with
this tradition, the paper by Brinksma, Mader and
Fehnker, entitled Verification and Optimization of a
PLC Control Schedule, describes the use of SPIN (as
well as UPPAAL [39]) for the analysis of a programmable
logic controller system. What makes this contribution
novel is, firstly, that the PLC controller is a
real-time system and SPIN does not support real-time
directly (the paper also describes a comparison study
with UPPAAL, which does support real-time), and
secondly, that not only correctness properties of the
controller are considered but also optimization issues
in the use of the controller. One of the contributions
of this paper is the use of variable time advance for
handling real-time in SPIN; we also adopted this
approach in the analysis of DEOS.


5 Java PathFinder 2 

5.1 Rationale

As pointed out in Section 3, the Java PathFinder 1 (JPF1)
model checker was highly successful, but had a number
of drawbacks that limited its effectiveness. The main
reason for this was the translation-based approach
adopted: although SPIN is a very powerful model checker
and the PROMELA language is very expressive, the mapping
between Java and PROMELA is not straightforward. Java
PathFinder 2 (henceforth JPF2) was developed to address
the shortcomings of JPF1 (see Section 3). Its goals
were to:

1. Handle all the language features of Java

2. Handle Java libraries

3. Allow more flexible approaches to model checking
   Java programs

The major design decision for JPF2 was to base it on 
a custom-made Java Virtual Machine (JVM) that could 
execute all Java bytecodes. This addressed issues 1 and 2 
from above, since all of Java could now be model checked 
and also all Java libraries. We addressed the third issue 
by designing JPF2 in a modular fashion in order to allow 
many different search strategies to be easily integrated 
into the model checker. 
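
As a rough illustration of such a modular design, the sketch below separates the order in which states are explored from the storage of visited states. It is a minimal sketch under our own assumptions: the names SearchSketch, Strategy and explore are hypothetical and do not come from JPF2, and the toy system is a simple counter.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch only: none of these names come from JPF2.
class SearchSketch {
    /** Decides where a newly discovered state goes in the frontier. */
    interface Strategy<S> {
        void schedule(Deque<S> frontier, S state);
    }

    static <S> Strategy<S> depthFirst() {
        return (frontier, state) -> frontier.addFirst(state);
    }

    static <S> Strategy<S> breadthFirst() {
        return (frontier, state) -> frontier.addLast(state);
    }

    /** Explores all states reachable from init; returns how many were found. */
    static <S> int explore(S init, Function<S, List<S>> successors, Strategy<S> strategy) {
        Set<S> visited = new HashSet<>();       // stored states: no state is expanded twice
        Deque<S> frontier = new ArrayDeque<>();
        visited.add(init);
        frontier.add(init);
        while (!frontier.isEmpty()) {
            S current = frontier.removeFirst();
            for (S next : successors.apply(current)) {
                if (visited.add(next)) {
                    strategy.schedule(frontier, next);
                }
            }
        }
        return visited.size();
    }

    public static void main(String[] args) {
        // Toy system: a counter modulo 5 that may step by 1 or by 2.
        Function<Integer, List<Integer>> step = s -> List.of((s + 1) % 5, (s + 2) % 5);
        System.out.println(explore(0, step, SearchSketch.<Integer>depthFirst()));
        System.out.println(explore(0, step, SearchSketch.<Integer>breadthFirst()));
    }
}
```

Both strategies visit the same set of states; only the exploration order differs, which is what makes the strategy easy to swap.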

A number of different research groups have worked on 
Java model checkers, but most of these have been based 
on the translation approach as used for JPF1 [11,9]. To 
date, JPF2 is still the only model checker that can handle 
all the language features of Java. The only other
custom-made model checkers that address real programming
languages are dSPIN [12], an extension of SPIN that can
handle dynamic memory creation and functions; the new
version of SPIN, which can handle a subset of C; and the
SLAM model checker [1], which checks reachability
properties of sequential C programs.

5.2 Design and Implementation 

JPF2 is written in Java, which made the development of
a custom-made JVM quite easy: we could exploit the
fact that we were doing “Java-in-Java” by allowing the
underlying JVM to handle the implementation of some
of the tricky bytecodes such as floating point division
(FDIV). We believe that writing JPF2 in Java
contributed to the fact that a prototype system with
functionality similar to that of JPF1 was completed in
only 3 man-months.

JPF2 is an explicit-state model checker which means 
it enumerates each reachable system state from the ini- 
tial state and in order to not redo work (and therefore 
terminate) it is required to store each reached state. 
When analyzing a Java program each state can be very 
large and thus require much memory to store, hence re- 
ducing the size of systems that can be handled during 
model checking. This was the fundamental problem that 
had to be solved for JPF2 to work. Others considered
this problem too hard and developed so-called state-less
model checkers (i.e. they don’t store states and
therefore do a partial state-space search) [20]. In JPF2
this problem is solved by using novel state-compression
techniques [41] that reduce the memory requirements of
the model checker by an order of magnitude. Another novel
feature of JPF2 is the use of symmetry reduction
techniques to allow states that are the same modulo where
an object is stored in memory to be considered equal
[41]. Since object-oriented programs typically make use
of many objects, this symmetry reduction often allows
an order of magnitude fewer states to be analyzed in a


JPF2 uses the BANDERA [9] toolset for specifying 
the properties to be analyzed, the display of the error- 
path if one exists, as well as for certain forms of abstrac- 
tions and slicing. BANDERA supports the specification 
of predicates within Javadoc comments that can be used 
to check linear temporal logic (LTL) behavioral proper- 
ties as well as pre- and postconditions for methods. To 
handle LTL properties JPF2 has a front-end translator
from LTL to Büchi automata [19] that is highly
optimized to produce succinct automata. The JPF2 model
checking algorithm then checks whether all program be- 
haviors comply with the behaviors described by the au- 
tomata, using a highly optimized algorithm based on the 
work presented in [51]. 

JPF2 also supports distributed memory model check- 
ing, where the memory required for model checking is 
distributed over a number of workstations [41]. Although 
this technique requires an additional time-overhead due 
to the sending of messages over a network, it allows ex- 
amples to be analyzed that previously would not fit in 
the memory of one workstation. The crucial factor for 
the success of distributed model checking in this fash- 
ion is how to partition the memory across the different 
workstations — in [41] we investigated a number of par- 
titioning schemes and found that dynamic partitioning 
(partitions evolve during model checking rather than be- 
ing statically fixed at initialization) worked best. 


5.3 Lessons Learned 

JPF2 has been successfully used in a number of projects;
most notably, the DEOS error (from Section 4) was
rediscovered in a Java translation of the original code.
More recently, 7000 lines of code from a Mars rover were
successfully analyzed. The JPF2 system was made available
to the user community via a web download in February
2001, and since then more than 100 organizations
have registered to use the tool. More importantly, JPF2
has had the desired effect of becoming a vehicle for re- 
search on analyzing programs with model checking: we 
have close collaborations with the BANDERA group at 
Kansas State University as well as other groups at CMU, 
Stony Brook, Minnesota, Freiburg and Liverpool Univer- 
sities. 

The development of JPF2 was the culmination of 3 
years of research in the application of model checking to 
software within the ASE group. In many ways, however, it
is the stepping-stone for the future: instead of worrying
about how to encode a program in some model checking 
notation, one can rather think of the behavioral proper- 
ties one would like to check, which parts of the program 
to abstract to make the model checking more tractable, 
and how to improve model checking for specific classes 
of programs. These issues will all be discussed in the 

5.4 Related Papers in this Special Section

As mentioned in Section 5.2 JPF2 supports LTL model 
checking through the use of the BANDERA toolset to 
describe the properties to be checked on the Java
programs. In this special section the language for
describing these properties, namely the BANDERA
Specification Language (BSL), is outlined in detail in
the paper by Corbett, Dwyer, Hatcliff and Robby, entitled
Expressing Checkable Properties of Dynamic Systems: the
BANDERA Specification Language. BSL has recently been
fully integrated with JPF2.

An important component of explicit-state model checking
is how to check temporal properties efficiently. JPF2,
as well as SPIN, uses the so-called automata-theoretical
approach, where each LTL (linear time temporal logic)
formula is translated to a Büchi automaton before model
checking commences. This translation from LTL to Büchi
automata has been the focus of much research, and a
number of tools for doing such a translation exist
(including the one used in JPF2 [19]). However, doing
this translation efficiently is non-trivial and therefore
also error-prone. The second paper, by Heikki Tauriainen
and Keijo Heljanko, entitled Testing LTL Formula
Translation into Büchi Automata, deals with this
somewhat overlooked area: the correctness of LTL to Büchi
translators. We will soon be relying on their technique
to test the LTL to Büchi translator used within JPF2 as
well.


6 Technologies for Software Model Checking 

For model checking to make an impact on the quality of 
programs produced the amount of human effort in oper- 
ating the tools should be kept to a minimum. With JPF2 
we have reduced the amount of effort considerably since 
a translation phase is no longer required. However, be- 
cause the automated translation preserves all the details 
of the software implementation, the model checking itself 
is more difficult. The reason is that manual translation
typically involves significant optimization and
abstraction of the system. Therefore, to truly reduce
the amount of manual effort and place model checking
into the development loop, we need tools to support the
typical optimizations and abstractions previously done
during translation. In general, the goal is to reduce
the state-space of the system that the model checker
needs to analyze, providing both scalability and
responsiveness.

6.1 Abstraction 

When using abstraction techniques to reduce the number
of states of a system one can either remove some
behaviors present in the original system
(under-approximations) or introduce new behaviors not
present in the original system (over-approximations).


6.1.1 Under-approximations 

Under-approximation of the behaviors is by far the most
common form of manual abstraction before model checking.
Under-approximation doesn’t preserve correctness,
i.e. if the abstract system satisfies a behavioral
property then it doesn’t follow that the original system
does as well. Under-approximations are, however, good
for finding errors, since an error in the abstract system
implies the same error in the original [50]. JPF2 was
built with the view that it should cover the spectrum of
analysis techniques from testing, where only one
execution of a program is analyzed, to model checking,
where all the paths are analyzed; hence JPF2 supports a
number of techniques for doing under-approximations
during model checking (we highlight two below).

Race-Guided - where a race analysis is done on the
  program first and, if a race violation is found, the
  model checker focuses on the threads involved to see
  what the race violation might lead to. We used this
  technique to find, in the Java translation of the
  Remote Agent, the error that occurred during flight
  [52].

Heuristic Search - using techniques from AI we can
  apply either general or program-specific heuristics to
  guide the search towards likely errors. For example,
  to find deadlocks we use a heuristic that tries to
  maximize the number of blocked threads; this heuristic
  found the Remote Agent deadlock within seconds, whereas
  in exhaustive mode the model checker fails due to
  memory limitations. We also developed a heuristic for
  finding problems that are due to thread interleaving
  and, lastly, one based on trying to increase structural
  testing coverage [22].
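
The idea behind the blocked-threads heuristic can be illustrated with a best-first search that always expands the pending state with the highest score. This is only a sketch under stated assumptions: the class HeuristicSearchSketch, its methods and the three-thread toy system (where bit i of the state means "thread i is blocked") are hypothetical and are not taken from JPF2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

// Illustrative sketch only: a generic best-first search, not JPF2 code.
class HeuristicSearchSketch {
    /** Returns how many states were expanded before an error state was reached, or -1. */
    static <S> int bestFirst(S init, Function<S, List<S>> successors,
                             ToIntFunction<S> score, Predicate<S> isError) {
        // Highest score first, e.g. "most blocked threads".
        PriorityQueue<S> frontier = new PriorityQueue<>(
            (a, b) -> Integer.compare(score.applyAsInt(b), score.applyAsInt(a)));
        Set<S> visited = new HashSet<>();
        visited.add(init);
        frontier.add(init);
        int expanded = 0;
        while (!frontier.isEmpty()) {
            S current = frontier.poll();
            if (isError.test(current)) {
                return expanded;
            }
            expanded++;
            for (S next : successors.apply(current)) {
                if (visited.add(next)) {
                    frontier.add(next);
                }
            }
        }
        return -1;
    }

    /** Toy system: each step blocks or unblocks exactly one of three threads. */
    static List<Integer> toySuccessors(int mask) {
        List<Integer> next = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            next.add(mask ^ (1 << i));
        }
        return next;
    }

    public static void main(String[] args) {
        int expanded = HeuristicSearchSketch.<Integer>bestFirst(
            0, HeuristicSearchSketch::toySuccessors, Integer::bitCount, s -> s == 0b111);
        System.out.println("states expanded before the deadlock state: " + expanded);
    }
}
```

In this toy system the search heads straight for the state in which all threads are blocked, instead of enumerating the whole state space first.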

6.1.2 Over-approximations

With this technique one represents a group of states in
the concrete (original) program by a small finite set of
states in the abstract program, which can therefore lead
to huge state-space reductions. This form of abstraction
is inspired by abstract interpretation as first used
in static program analysis [10], where the data domain 
(type) of a variable is replaced by an abstract type over 
which all concrete operations are then interpreted. Note 
that this type of abstraction causes more behaviors to be 
present in the abstract program than in the original
program. The fact that more behaviors are possible in
the abstract program means that if a behavioral property
expressed in LTL holds in the abstract it also holds in
the concrete, but if an LTL property fails in the
abstract then it might not fail in the concrete (since
it might fail due to a behavior found in the abstract
that is not present in the concrete). Another very
popular form of over-approximation is called predicate
abstraction [21, 1]: here one replaces a predicate used
in the program by a boolean variable, and all updates to
the variables in the predicate are changed to updates of
the boolean variable.
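
A small worked example may help. Assume, purely for illustration, the predicate x == y and the concrete statement x = x + 1; neither is taken from the tools discussed here. The boolean b stands for the predicate, and the abstract statement updates b instead of x. The sketch below also checks, over a small range of values, that every concrete step is covered by the abstract one.

```java
import java.util.Set;

// Illustrative sketch only; the predicate and statement are made up.
class PredicateAbstractionSketch {
    /** Abstraction function: the boolean b stands for the predicate x == y. */
    static boolean alpha(int x, int y) {
        return x == y;
    }

    /** Concrete statement: x = x + 1 (y is unchanged); returns the new x. */
    static int concreteStep(int x) {
        return x + 1;
    }

    /** Abstract statement over b: the set of values b may have afterwards. */
    static Set<Boolean> abstractStep(boolean b) {
        // If x == y held, it cannot hold after x = x + 1.
        // If it did not hold, the abstraction cannot tell, so both values are possible.
        return b ? Set.of(false) : Set.of(false, true);
    }

    /** Checks that every concrete step in the given range is covered by the abstract step. */
    static boolean coversAllConcreteSteps(int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                boolean before = alpha(x, y);
                boolean after = alpha(concreteStep(x), y);
                if (!abstractStep(before).contains(after)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(coversAllConcreteSteps(-5, 5));
    }
}
```

The case where b is false shows where the extra behaviors come from: the abstract program allows b to become true even for concrete values where x + 1 differs from y.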

JPF2 supports predicate abstraction [50] and BANDERA
supports type-based abstraction [17]. In order to handle
over-approximations of the program behaviors we have
extended Java with two special method calls that signal
nondeterministic choice (random(n), which returns values
between 0 and n inclusive, and randomBool(), which
returns true or false); whenever the model checker
encounters these methods it will nondeterministically
try all possible results for each call.
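
One simple way to picture "trying all possible results" is to re-run the code once for every combination of choice outcomes. The sketch below does this by replaying a script of earlier choices and backtracking on the last one. It is an assumption-laden illustration: ChoiceSketch and exploreAll are hypothetical names, the real model checker backtracks over stored states rather than re-executing, and the body is assumed to be deterministic apart from its choice calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch only: ChoiceSketch is a hypothetical stand-in, not the JPF2 API.
class ChoiceSketch {
    private final List<Integer> script = new ArrayList<>(); // value chosen at each choice point
    private final List<Integer> bounds = new ArrayList<>(); // largest value allowed at each choice point
    private int position;                                   // next choice point in the current run

    /** Returns a value between 0 and n inclusive; every value is tried in some run. */
    int random(int n) {
        if (position == script.size()) {
            script.add(0); // new choice point: start with 0
        }
        bounds.add(n);
        return script.get(position++);
    }

    /** Returns false in some runs and true in others. */
    boolean randomBool() {
        return random(1) == 1;
    }

    /**
     * Re-runs body once per combination of choice outcomes and returns the number of runs.
     * Assumes body is deterministic apart from its calls to random and randomBool.
     */
    static int exploreAll(Consumer<ChoiceSketch> body) {
        ChoiceSketch c = new ChoiceSketch();
        int runs = 0;
        while (true) {
            c.position = 0;
            c.bounds.clear();
            body.accept(c);
            runs++;
            // Backtrack: drop exhausted trailing choices, then advance the last open one.
            int last = c.script.size() - 1;
            while (last >= 0 && c.script.get(last).equals(c.bounds.get(last))) {
                c.script.remove(last);
                last--;
            }
            if (last < 0) {
                return runs;
            }
            c.script.set(last, c.script.get(last) + 1);
        }
    }

    public static void main(String[] args) {
        int runs = exploreAll(c -> {
            int speed = c.random(2);
            boolean sensorOk = c.randomBool();
            System.out.println("speed=" + speed + " sensorOk=" + sensorOk);
        });
        System.out.println(runs + " runs");
    }
}
```

With one three-valued choice followed by one boolean choice, the body is executed six times, once per combination.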

Predicate Selection The first problem one encounters
with the application of over-approximations in practice
is how to select the parts of the program to abstract;
typically this requires human intervention. In BANDERA,
type-based abstraction can be done automatically though,
by doing a backward dependency analysis of the program
from all points that directly influence the temporal
property to be checked, to determine a set of variables
that most influence program behavior (with respect to
the property to be checked); these variables then become
candidates for abstraction [17]. Although predicate
abstraction can be applied automatically too, by
selecting all predicates in the program and in the
property, we have found that in practice this leads to
too many spurious counter-examples (i.e. too many
behaviors not present in the original that then lead to
property violations).

Abstract Program Creation Both predicate and
type-based abstraction can be applied either during or
before model checking. In practice, however, the
calculations required to determine the abstract state of
a program are too slow to be done during model checking,
and therefore we only use abstract program creation
before model checking in JPF2 and BANDERA. In order
to calculate an abstract operation, given an abstraction
mapping (type or predicate) and a concrete operation,
one requires an automated theorem prover (i.e. a set of
decision procedures for the domain). For predicate
abstraction we use the Stanford Validity Checker (SVC)
[2] to calculate abstract statements and for type-based
abstraction BANDERA uses PVS [42]. Although most
of the abstraction calculations are done before model
checking, object-oriented programs are particularly
challenging for predicate abstraction, since predicates
may relate variables from different classes that during
execution can have a number of instantiations. Predicate
abstraction is typically done in a static setting,
whereas with object-oriented programs predicates can be
created dynamically during execution when new objects
are instantiated. JPF2 therefore supports a mechanism
that allows predicates to be created on-the-fly during
model checking when predicates are specified across
different classes [50].

Result Interpretation The biggest drawback of
over-approximation based abstractions is that errors


The more aggressive the abstraction, i.e. the bigger the
state-space reduction one achieves, the more likely it is
that a spurious error will occur. It is well known
that users of systems where spurious errors can be
reported are more likely to complain about the spurious
errors than if the system reported no errors (supported
by data presented by Microsoft after using their static
analysis tool PREfix for discovering run-time errors
[49]). For a program model checker using abstraction to
be of practical use it is therefore vitally important
that spurious errors be eliminated. JPF2 supports a novel
technique for achieving this goal: as a first pass after
abstraction it only searches the parts of the abstract
program’s state-space that it knows contain no behaviors
that are not also part of the concrete program [47]. One
can view this as doing an on-the-fly under-approximation
of the state-space generated from doing an
over-approximation of the original program. This
technique has been remarkably successful: the bugs in
both the Remote Agent and DEOS examples can be found
using abstraction and this search technique.

Abstraction Refinement An abstraction can be 
too coarse in certain situations, i.e. a spurious error can- 
not be removed unless the abstraction is refined. JPF2 
supports a very practical approach to determining where 
a refinement is necessary: the path reported by JPF2 
as a counter-example over the abstract program is exe- 
cuted over the concrete and where the path diverges (if 
it doesn't diverge then of course the abstract path is not 
spurious) the predicates at that point are likely candi- 
dates to refine the abstraction. Refinement then proceeds 
by adding these predicates to the predicate abstraction 
and repeating the abstract program creation. This ap- 
proach was first demonstrated in the Invest tool [3]. 

6.2 Slicing 

Slicing is a technique that yields a precise abstraction 
(neither over nor under approximation) of the program 
behavior with respect to the property being analyzed 
[14]. A sliced program yields a smaller state space than 
the original un-sliced program, and hence, slicing allows 
the model checker to handle larger programs. There are 
two important aspects to selecting statements that will 
be eliminated. First, these statements should not appear 
on the dependence graphs of the statements containing 
variables that are terms of the property being checked. 
Second, the sliced program should still be executable 
(since JPF2 is an explicit-state model checker). Slicing 
in JPF2 is provided through the slicing capability of the
BANDERA toolset [9]. In BANDERA, slicing is performed
based on six types of dependencies [24]: three
intra-thread dependencies, which are usually found in
sequential programs, namely data, control and divergence
dependencies, and three types of dependencies
(interference, synchronization, and ready dependencies)
that cap-

6.3 Partial-Order Reduction

The goal of partial order reduction is to exploit the com- 
mutativity of concurrent transitions to reduce the state 
space that needs to be explored by a model checker. This 
technique, which is well described in [5], relies on the 
concept of independent transitions. Two transitions are
independent if the execution of one does not disable the
other, and vice versa (enabledness condition), and they
result in the same state regardless of their execution
order (commutativity condition). JPF2 relies on a
stronger concept based on safe transitions [38]. In
essence, a transition is safe if it is independent of
any transition of any other thread. A partial order
reduction scheme that selects only safe transitions for
exploration, when some exist, is guaranteed to yield
correct results.
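
The two conditions can be checked directly in a single state, as the following sketch shows. It is a minimal illustration under our own assumptions: the Transition class, the list-based state and the three example transitions are hypothetical, and a real partial order reduction would establish independence statically for all states rather than test one state at a time.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Illustrative sketch only: checks the two independence conditions in one given state.
class IndependenceSketch {
    /** A transition is a guard plus an effect on an immutable state. */
    static final class Transition<S> {
        final Predicate<S> enabled;
        final UnaryOperator<S> effect;

        Transition(Predicate<S> enabled, UnaryOperator<S> effect) {
            this.enabled = enabled;
            this.effect = effect;
        }
    }

    /** Checks enabledness and commutativity in a state s where t1 and t2 are both enabled. */
    static <S> boolean independentAt(S s, Transition<S> t1, Transition<S> t2) {
        if (!t1.enabled.test(s) || !t2.enabled.test(s)) {
            throw new IllegalArgumentException("both transitions must be enabled in s");
        }
        S after1 = t1.effect.apply(s);
        S after2 = t2.effect.apply(s);
        // Enabledness condition: neither transition disables the other.
        boolean neitherDisablesOther = t2.enabled.test(after1) && t1.enabled.test(after2);
        // Commutativity condition: both execution orders reach the same state.
        return neitherDisablesOther
            && t2.effect.apply(after1).equals(t1.effect.apply(after2));
    }

    // Toy state: a two-element list [x, y].
    static final Transition<List<Integer>> INC_X =
        new Transition<>(s -> true, s -> List.of(s.get(0) + 1, s.get(1)));
    static final Transition<List<Integer>> INC_Y =
        new Transition<>(s -> true, s -> List.of(s.get(0), s.get(1) + 1));
    static final Transition<List<Integer>> DOUBLE_X =
        new Transition<>(s -> true, s -> List.of(s.get(0) * 2, s.get(1)));

    public static void main(String[] args) {
        List<Integer> start = List.of(1, 1);
        System.out.println(independentAt(start, INC_X, INC_Y));    // different variables
        System.out.println(independentAt(start, INC_X, DOUBLE_X)); // same variable, order matters
    }
}
```

Updates to different variables commute, whereas incrementing and doubling the same variable give different results depending on the order, so only one interleaving of the first pair needs to be explored.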

From a static analysis point of view, identifying safe 
statements can be reduced to the problem of identifying 
objects that can escape the thread where they have been 
created. Indeed, if we can identify such objects we can 
identify objects that can be shared by different threads. 
Then, unsafe statements are those that access shared 
objects, as well as those that correspond to entering a 
monitor in Java (these ones are easily identifiable syntac- 
tically). Our “safe statement” analysis is essentially an 
aliasing analysis. In a first phase, we build the program 
call graphs associated with each thread. As we build 
these graphs, we identify some escaping objects (they 
are passed as arguments to the class constructor of the 
thread). It is easy to realize that all other escaping ob- 
jects are aliased to the escaping objects identified in the 
first phase. Therefore, the second phase consists of an 
aliasing analysis. Note that we do not have to compute 
aliases created by considering all interleavings (which is 
quite costly). Indeed, all escaping objects are identified 
by computing “intra-thread” aliases. This means that 
the complexity of our analysis is similar to the complex- 
ity of an aliasing analysis for sequential programs. 

6.4 Environment Generation 

An explicit-state model checker, such as JPF2, requires
a closed system to analyze, i.e. a system and the
environment it needs to operate in must be provided
before model checking [16]. Often, however, the
environment is not available and it needs to be created.
During testing an analogous problem exists when a
test harness must be created; however, a few subtle but
important differences exist. For model checking it is
important that all relevant environment behavior be
present, whereas in testing only a subset of all
possible test cases will be tested. Knowing which
environment actions are relevant is, however, only
possible with domain knowledge, something not always
available if the domain experts are not involved in
the model checking (as is almost always the case in a

A common approach favored during model checking
of systems without a known environment is to create the 
most aggressive environment, i.e. one that can perform 
any legal action at any possible time — often referred 
to as the universal environment [15]. If a property holds 
for a system composed with its most aggressive environ- 
ment then the system will be correct when used in any 
environment. This is similar to the case where an over- 
approximation is done during abstraction. Unfortunately 
it also has the same problem as over-approximation in 
abstraction: spurious errors may result since the
universal environment allows behaviors for which the
system was not designed. A novel approach to removing
such spurious behaviors is to filter unwanted behaviors
from the environment using LTL properties augmented
with filter properties [15]. This technique was
successfully used to create the DEOS system’s environment
[46] in only a few days, rather than the 2 months spent
creating the environment manually.

6.5 Related Papers in this Special Section 

Two of the papers in this special section are related to 
JPF2 as well as the state-space reduction techniques de- 
scribed in this section. 

Firstly, the paper by Scott Stoller, entitled
Model-Checking Multi-Threaded Distributed Java Programs,
exploits the specific thread synchronization facilities
in Java to optimize model checking by improving
partial-order reductions (see Section 6.3). This work is
illustrated within the context of doing state-less model
checking (see Section 5.2) and is also implemented
within JPF2.

Curbing the omnipresent state-explosion problem has 
been a fruitful line of research within the SPIN com- 
munity as well as the model checking field in general. 
One popular technique for combating the state-explosion
problem, not highlighted in this section, is to exploit
symmetry reductions within the system that is being
analyzed. The paper by Bosnacki, Dams and Holenderski,
entitled Symmetric SPIN, introduces a symmetry reduction
package for SPIN. The significance of this work lies
not only in the theoretical contributions, but also in
the fact that the research ideas were implemented within
SPIN and are supported by empirical data. As mentioned
in Section 5.2, JPF2 also supports symmetry reductions,
but only for the objects instantiated within the Java
program, whereas this work also handles symmetries in
the process structure.


7 Java PathExplorer 

7.1 Rationale 

Since the Java PathFinder attempts to explore the
entire state space of a Java program, storing the states
explicitly, it naturally suffers from the classical state
space explosion problem. For very large applications one
may therefore want to apply complementary techniques
more closely related to traditional testing. Testing can
be characterized as: “execute the program with different
test-cases and observe each execution, comparing it to
the expected behavior”. Although we believe that the
area of automated test-case generation has great
potential, we think that its maturity is still at least
5 years out in the future. Also, providing a general,
application-independent framework for automated test-case
generation is not obvious. Engineers at JPL, in
addition, expressed scepticism that such automation
could be done, suggesting that in the end it always
requires some engineer to sit down and think out what
the test case should be. Our goal was to develop a
technology that had a chance of being adopted by
spacecraft designers within a relatively short time
horizon (a couple of years). Our interest hence turned
to the observation part of the equation. The question
was:

How much information can be extracted about a
program from observing a single execution trace?

It was our intention to develop a technology that could 
be applied automatically and to large full-size applica- 
tions, with minimal modification to the code. The SPIN 
2000 workshop hosted two invited talks on two com- 
mercial tools in this category: Temporal Rover [13] and 
Visual Threads [23]. Temporal Rover monitors the
execution of a program, and checks its behavior against
a collection of formulae written in a temporal logic.
Formulae are written in the code as comments, and then
translated into formula-checking code, which is
executed as assertions. Visual Threads performs various
concurrency error analyses, such as deadlock and data
race analysis. In particular, it implements the Eraser
algorithm [48] for detecting data races. It was decided
to build a tool, Java PathExplorer (JPaX) [31-33], which
combined the functionality of these two tools and, in
addition, added new functionality. The Java PathExplorer
analyzes (explores) single execution traces.
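
The core idea of lockset-based data race detection, as used by Eraser, can be sketched as follows: for each shared variable, keep the set of locks that were held at every access so far, and warn when that set becomes empty. The sketch is a simplification under our own assumptions: it omits the refinements of the published algorithm (such as its treatment of initialization and read-only sharing), and the class and method names are hypothetical rather than taken from Visual Threads or JPaX.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only: the basic lockset refinement, without Eraser's further refinements.
class LocksetSketch {
    private final Map<String, Set<String>> held = new HashMap<>();       // thread -> locks held now
    private final Map<String, Set<String>> candidates = new HashMap<>(); // variable -> locks held at every access
    private final Set<String> warnings = new TreeSet<>();

    void acquire(String thread, String lock) {
        held.computeIfAbsent(thread, t -> new HashSet<>()).add(lock);
    }

    void release(String thread, String lock) {
        Set<String> locks = held.get(thread);
        if (locks != null) {
            locks.remove(lock);
        }
    }

    /** Records an access and intersects the variable's candidate set with the locks held. */
    void access(String thread, String variable) {
        Set<String> locks = held.getOrDefault(thread, Collections.emptySet());
        Set<String> candidate = candidates.get(variable);
        if (candidate == null) {
            candidate = new HashSet<>(locks);
            candidates.put(variable, candidate);
        } else {
            candidate.retainAll(locks);
        }
        if (candidate.isEmpty()) {
            warnings.add(variable); // no single lock protects every access seen so far
        }
    }

    Set<String> warnings() {
        return Collections.unmodifiableSet(warnings);
    }

    public static void main(String[] args) {
        LocksetSketch monitor = new LocksetSketch();
        monitor.acquire("T1", "m1");
        monitor.access("T1", "counter");
        monitor.release("T1", "m1");
        monitor.acquire("T2", "m2");
        monitor.access("T2", "counter"); // protected by a different lock than before
        monitor.release("T2", "m2");
        System.out.println(monitor.warnings());
    }
}
```

Note that the warning is raised from one observed trace even though the two accesses in that trace did not actually overlap in time.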

Visual Threads is tightly coupled to Compaq’s Alpha 
microprocessors, and in addition does not work properly 
on Java programs. Hence, one goal was to port some of 
the technology to work for Java. The Temporal Rover 
required manual instrumentation of the code. We
decided that automated instrumentation is desired, and
hence focused on providing that capability. In Temporal
Rover one can for example state a property over a set of
program variables. One then has to insert the property
at each update of these variables manually. With
automated instrumentation capability, the property checks
will be inserted automatically at all updates. This work
was also inspired by the MAC tool [40], which performs 

7.2 Design and Implementation

Two kinds of event analysis are currently implemented. 

Logic based monitoring consists of runtime checking 
formal requirement specifications written in high level 
logics by users of the system. Logics are currently imple- 
mented in Maude [6], a high-performance system sup- 
porting both rewriting logic and membership equational 
logic. One can naturally and easily define new logics 
in Maude, such as for example temporal logics [44], to- 
gether with their finite trace operational semantics. Cur- 
rently, JPaX supports two built-in logics, future time 
and past time linear temporal logic. 
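
To give a flavor of checking temporal formulae against a finite trace, the sketch below evaluates a few future time and past time operators over a recorded list of events. It is written in Java purely for illustration and is not how JPaX implements its logics (those are defined in Maude). The finite-trace reading used here, where "always" and "eventually" range over the remaining positions of the trace, is an assumption, and all names are hypothetical.

```java
import java.util.List;
import java.util.Set;

// Illustrative sketch only: a direct evaluator over a finite trace, not the JPaX/Maude semantics.
class LtlSketch {
    /** A formula is evaluated at position i of a trace of event sets. */
    interface Formula {
        boolean holds(List<Set<String>> trace, int i);
    }

    static Formula atom(String name) {
        return (trace, i) -> trace.get(i).contains(name);
    }

    static Formula implies(Formula a, Formula b) {
        return (trace, i) -> !a.holds(trace, i) || b.holds(trace, i);
    }

    /** Future time: f holds at every remaining position. */
    static Formula always(Formula f) {
        return (trace, i) -> {
            for (int j = i; j < trace.size(); j++) {
                if (!f.holds(trace, j)) return false;
            }
            return true;
        };
    }

    /** Future time: f holds at some remaining position. */
    static Formula eventually(Formula f) {
        return (trace, i) -> {
            for (int j = i; j < trace.size(); j++) {
                if (f.holds(trace, j)) return true;
            }
            return false;
        };
    }

    /** Past time: f held at the current or some earlier position. */
    static Formula once(Formula f) {
        return (trace, i) -> {
            for (int j = i; j >= 0; j--) {
                if (f.holds(trace, j)) return true;
            }
            return false;
        };
    }

    public static void main(String[] args) {
        Formula response = always(implies(atom("request"), eventually(atom("grant"))));
        Formula precedence = always(implies(atom("grant"), once(atom("request"))));
        List<Set<String>> trace = List.of(Set.of("request"), Set.<String>of(), Set.of("grant"));
        System.out.println(response.holds(trace, 0));
        System.out.println(precedence.holds(trace, 0));
    }
}
```

This naive evaluator rescans the trace for each operator; it is meant to convey the semantics, not an efficient monitoring algorithm.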

Error pattern analysis consists of analyzing one ex- 
ecution trace using various error detection algorithms 
that can identify error-prone programming practices that 
may potentially lead to errors in some executions. Two 
such algorithms focusing on concurrency errors have been 
implemented in JPaX, one for deadlocks and the other 
for data races: the Eraser algorithm [48]. It is important
to note that a deadlock or data race does not need to
actually occur in order for its potential to be
detected with these algorithms. This is what makes them
very useful in practice. As an example, the deadlock
algorithm works by building a graph of locks acquired
during the execution, adding an edge from a lock L1 to a
lock L2 if some thread T holds L1 while acquiring L2. The
lock graph accumulates all such updates and a warning
is issued if it eventually becomes cyclic.
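
The lock graph construction just described can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the JPaX implementation: the class LockGraphSketch and its methods are hypothetical, events are fed in by hand rather than by instrumentation, and re-acquiring a lock already held is simply ignored when adding edges.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: a lock graph with cycle detection, not the JPaX code.
class LockGraphSketch {
    private final Map<String, Deque<String>> held = new HashMap<>(); // thread -> locks currently held
    private final Map<String, Set<String>> edges = new HashMap<>();  // L1 -> locks acquired while holding L1

    void acquire(String thread, String lock) {
        Deque<String> locks = held.computeIfAbsent(thread, t -> new ArrayDeque<>());
        for (String holding : locks) {
            if (!holding.equals(lock)) {
                edges.computeIfAbsent(holding, k -> new HashSet<>()).add(lock);
            }
        }
        locks.push(lock);
    }

    void release(String thread, String lock) {
        Deque<String> locks = held.get(thread);
        if (locks != null) {
            locks.remove(lock);
        }
    }

    /** Returns true if the accumulated lock graph contains a cycle. */
    boolean hasCycle() {
        Set<String> visited = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String lock : edges.keySet()) {
            if (visit(lock, visited, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String lock, Set<String> visited, Set<String> onPath) {
        if (onPath.contains(lock)) {
            return true;  // back edge: the current path loops
        }
        if (!visited.add(lock)) {
            return false; // already fully explored
        }
        onPath.add(lock);
        for (String next : edges.getOrDefault(lock, Collections.emptySet())) {
            if (visit(next, visited, onPath)) {
                return true;
            }
        }
        onPath.remove(lock);
        return false;
    }

    public static void main(String[] args) {
        LockGraphSketch graph = new LockGraphSketch();
        graph.acquire("T1", "L1");
        graph.acquire("T1", "L2");
        graph.release("T1", "L2");
        graph.release("T1", "L1");
        graph.acquire("T2", "L2");
        graph.acquire("T2", "L1"); // opposite order, although no deadlock happened in this run
        System.out.println(graph.hasCycle());
    }
}
```

In the example the two threads never block each other, yet the opposite acquisition orders still produce a cycle, which is exactly the kind of potential deadlock the analysis is meant to flag.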

An instrumentation module performs a script-driven 
automated instrumentation of the program to be ob- 
served. The instrumented program, when run, will emit 
relevant events to an observer, potentially running on a 
different computer, in which case the events are trans- 
mitted over a socket. The Java byte code instrumenta- 
tion is performed using the powerful Jtrek Java byte code 
engineering tool [8] from Compaq. Jtrek makes it
possible to easily read Java class files (byte code files), and
traverse them as abstract syntax trees while examining 
their contents, and insert new code. 

7.3 Lessons Learned 

At the time of writing, PathExplorer is being applied 
to a couple of case studies at NASA Ames, with so far 
promising results. Deadlocks and data races have, for
example, been located. Deadlock and data race analysis,
however, is evidently limited to only those kinds of
errors. So although the technology is powerful, it only
covers a small fraction of the errors usually contained
in software. Temporal logic monitoring can be used to
check a broader class of errors, although in this case
an error has to actually occur in order to be detected.
Runtime monitoring can potentially be combined with
model checking, for example as described in our paper in
the SPIN 2000 proceedings [27]. Here deadlock and data
race analysis has been integrated into the Java
PathFinder tool in
such a way that one can first run the tool in simulation 
mode where deadlock and data race potentials are de- 
tected in a very scalable manner, whereafter the model 
checker is started to focus in on the threads involved in 
the warnings. A major issue that current case studies 
demonstrate is that it is difficult for software engineers 
to generate requirements that a software system should 
satisfy, even in English. Hence, it is interesting that for- 
malizing the properties, once provided informally, is not 
the main problem. 

8 Summary 

In the previous sections we tried to give a flavor of the re- 
search within the Automated Software Engineering group 
at NASA Ames, that led to the decision to focus the 
7th SPIN Workshop on model checking software. The 
sections related to the different research activities de- 
scribed were given in a roughly chronological order of 
when the work started. The concept of the workshop 
was formulated in late 1999, which would place it some- 
where in the early stages of the JPF2 (Section 5) and 
Java PathExplorer (Section 7) development. These two 
projects as well as the work on supplementary technolo- 
gies for model checking (Section 6) are very much still 
ongoing. 

A number of other projects (in the ASE group) in 
the general field of software verification and validation 
have started since the SPIN 2000 workshop, but since 
these are not directly related to the workshop we only 
briefly mention them here: 

- We use the PolySpace Verifier [45] to check for run- 
time errors in Space Flight software, and have found 
errors in Mars PathFinder code as well as in code 
to run biological experiments on the International 
Space Station. PolySpace is a commercially available 
tool that uses static analysis techniques to discover 
errors. 

- In a joint project with the University of Minnesota we
are using JPF2 for test-case generation [35]. Within 
the context of this work we are currently extending 
JPF2 with the capability to do symbolic execution. 

We would like to emphasize that we regard program 
analysis as a complementary technique to design anal- 
ysis, and hopefully the two approaches eventually can 
coexist within a unified framework. 